Sean J. Birch

I graduated from Carnegie Mellon University’s top-ranked Statistics and Economics programs with University and College Honors while conducting research that culminated in two senior thesis projects. Alongside my studies, I completed internships and gained industry experience geared toward a career in data analytics, where I made substantial contributions and worked across the entire data life cycle.

This plot shows a border precinct matching algorithm for Allegheny County, Pennsylvania. The algorithm is generalized to run on any county.

This scatterplot shows a Principal Component Analysis of Supreme Court (SCOTUS) cases. Moving from the 2022 Term to the 2023 Term, the horizontal axis becomes much more important and the Court shifts from 3 clusters to 2.

This graph pairs Supreme Court (SCOTUS) justices to show how often they voted together for the 2022 and 2023 Terms.

This map shows all wind farms in the contiguous United States along with their custom 3-dimensional turbine density metrics, evaluating turbine wind interference.

This plot shows the decrease in debt service payments as a percent of disposable personal income from 2001 to 2020 as a function of unemployment.

Visualization Honors

Statistical Graphics Visualization Team Competition: 1st place out of 26 teams, judged by 20 statistics professionals.

Economics and Data Science Individual Visualization Challenge: 1st place out of 67 entries, with a score of 98/100.


## 
## Call:
## tslm(formula = Consumption ~ Income, data = uschange)
## 
## Coefficients:
## (Intercept)       Income  
##      0.5451       0.2806
## 
## Call:
## tslm(formula = Consumption ~ Income + Production + Unemployment + 
##     Savings, data = uschange)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.88296 -0.17638 -0.03679  0.15251  1.20553 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.26729    0.03721   7.184 1.68e-11 ***
## Income        0.71449    0.04219  16.934  < 2e-16 ***
## Production    0.04589    0.02588   1.773   0.0778 .  
## Unemployment -0.20477    0.10550  -1.941   0.0538 .  
## Savings      -0.04527    0.00278 -16.287  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3286 on 182 degrees of freedom
## Multiple R-squared:  0.754,  Adjusted R-squared:  0.7486 
## F-statistic: 139.5 on 4 and 182 DF,  p-value: < 2.2e-16

## 
## Call:
## tslm(formula = aussies ~ guinearice)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.9448 -1.8917 -0.3272  1.8620 10.4210 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.493      1.203  -6.229 2.25e-07 ***
## guinearice    40.288      1.337  30.135  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.239 on 40 degrees of freedom
## Multiple R-squared:  0.9578, Adjusted R-squared:  0.9568 
## F-statistic: 908.1 on 1 and 40 DF,  p-value: < 2.2e-16

## 
## Call:
## tslm(formula = beer2 ~ trend + fourier(beer2, K = 2))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -42.903  -7.599  -0.459   7.991  21.789 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               446.87920    2.87321 155.533  < 2e-16 ***
## trend                      -0.34027    0.06657  -5.111 2.73e-06 ***
## fourier(beer2, K = 2)S1-4   8.91082    2.01125   4.430 3.45e-05 ***
## fourier(beer2, K = 2)C1-4  53.72807    2.01125  26.714  < 2e-16 ***
## fourier(beer2, K = 2)C2-4  13.98958    1.42256   9.834 9.26e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.23 on 69 degrees of freedom
## Multiple R-squared:  0.9243, Adjusted R-squared:  0.9199 
## F-statistic: 210.7 on 4 and 69 DF,  p-value: < 2.2e-16

The first argument to fourier() allows it to identify the seasonal period $m$ and the length of the predictors to return. The second argument K specifies how many pairs of sin and cos terms to include. The maximum allowed is $K = m/2$, where $m$ is the seasonal period. Because we have used the maximum here, the results are identical to those obtained when using seasonal dummy variables.
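The equivalence between a full set of Fourier terms and seasonal dummy variables can be checked directly. The following is a minimal base R sketch, not the code that produced the output above: it uses the built-in quarterly `UKgas` series as a stand-in for `beer2`, and `fourier_terms()` is a hypothetical helper written for illustration rather than the `fourier()` function from the forecast package.

```r
# Illustrative stand-in data: quarterly UK gas consumption from base R.
y      <- as.numeric(log(UKgas))
m      <- frequency(UKgas)        # seasonal period (4 for quarterly data)
trend  <- seq_along(y)            # linear time trend
season <- factor(cycle(UKgas))    # quarter labels 1..4 for the dummy model

# Hypothetical helper: K pairs of sin/cos terms for seasonal period m.
# When 2 * k == m, the sine term is sin(pi * t) = 0 at every integer t,
# so that column is dropped, leaving m - 1 columns when K = m / 2.
fourier_terms <- function(t, m, K) {
  stopifnot(K <= m / 2)
  cols <- list()
  for (k in seq_len(K)) {
    if (2 * k < m) cols[[paste0("S", k)]] <- sin(2 * pi * k * t / m)
    cols[[paste0("C", k)]] <- cos(2 * pi * k * t / m)
  }
  do.call(cbind, cols)
}

X <- fourier_terms(trend, m, K = m / 2)   # columns S1, C1, C2

fit_fourier <- lm(y ~ trend + X)          # trend + maximal Fourier terms
fit_dummy   <- lm(y ~ trend + season)     # trend + seasonal dummies

# Both designs span the same space, so the fitted values should agree
# up to floating-point error.
same_fit <- isTRUE(all.equal(unname(fitted(fit_fourier)),
                             unname(fitted(fit_dummy))))
```

With $K = m/2$ on quarterly data the helper returns three columns (S1, C1, C2), matching the three Fourier coefficients in the summary above. Choosing a smaller K gives a smoother, more restricted seasonal pattern with fewer parameters, which is where Fourier terms differ from dummies.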

##           CV          AIC         AICc          BIC        AdjR2 
##    0.1163477 -409.2980298 -408.8313631 -389.9113781    0.7485856
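The five measures printed above can be computed from any linear model fit. The sketch below is a base R illustration under stated assumptions: `fit_measures()` is a hypothetical helper, the model is fitted to the built-in `mtcars` data rather than `uschange`, and the formulas are the usual least-squares forms with $k$ predictors and $n$ observations. These forms are consistent with the printed values: with $n = 187$ and $k = 4$, the correction $2(k+2)(k+3)/(n-k-3) = 84/180 \approx 0.467$ equals the gap between AICc and AIC shown above.

```r
# Illustrative model on a base R dataset (not the uschange fit above).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: predictive-accuracy measures for an lm fit.
fit_measures <- function(fit) {
  res <- residuals(fit)
  n   <- length(res)
  k   <- length(coef(fit)) - 1      # predictors, excluding the intercept
  sse <- sum(res^2)

  aic  <- n * log(sse / n) + 2 * (k + 2)
  aicc <- aic + 2 * (k + 2) * (k + 3) / (n - k - 3)
  bic  <- n * log(sse / n) + (k + 2) * log(n)

  # Leave-one-out cross-validation MSE, obtained from the hat values
  # without refitting the model n times.
  cv <- mean((res / (1 - hatvalues(fit)))^2)

  c(CV = cv, AIC = aic, AICc = aicc, BIC = bic,
    AdjR2 = summary(fit)$adj.r.squared)
}

measures <- fit_measures(fit)
```

Lower values of CV, AIC, AICc and BIC indicate a better model, while a higher adjusted R-squared is better. Only differences between candidate models fitted to the same data are meaningful, not the absolute values.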